{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Get embeddings from dataset 本地的数据集embeddings处理，向量化\n",
    "\n",
    "This notebook gives an example on how to get embeddings from a large dataset.\n",
    "\n",
    "\n",
    "## 1. Load the dataset  加载数据集\n",
    "\n",
    "The dataset used in this example is [fine-food reviews](https://www.kaggle.com/snap/amazon-fine-food-reviews) from Amazon. The dataset contains a total of 568,454 food reviews Amazon users left up to October 2012. We will use a subset of this dataset, consisting of 1,000 most recent reviews for illustration purposes. The reviews are in English and tend to be positive or negative. Each review has a ProductId, UserId, Score, review title (Summary) and review body (Text).\n",
    "\n",
    "We will combine the review summary and review text into a single combined text. The model will encode this combined text and it will output a single vector embedding."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To run this notebook, you will need to install: pandas, openai, transformers, plotly, matplotlib, scikit-learn, torch (transformer dep), torchvision, and scipy."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "import tiktoken\n",
    "\n",
    "from utils.embedings_utils import get_embedding"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "embedding_model = \"text-embedding-3-small\"\n",
    "embedding_encoding = \"cl100k_base\"\n",
    "max_tokens = 8000  # the maximum for text-embedding-3-small is 8191"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": "         Time   ProductId          UserId  Score  \\\n0  1351123200  B003XPF9BO  A3R7JR3FMEBXQB      5   \n1  1351123200  B003JK537S  A3JBPC3WFUT5ZP      1   \n\n                                             Summary  \\\n0  where does one  start...and stop... with a tre...   \n1                                  Arrived in pieces   \n\n                                                Text  \\\n0  Wanted to save some to bring to my Chicago fam...   \n1  Not pleased at all. When I opened the box, mos...   \n\n                                            combined  \n0  Title: where does one  start...and stop... wit...  \n1  Title: Arrived in pieces; Content: Not pleased...  ",
      "text/html": "<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>Time</th>\n      <th>ProductId</th>\n      <th>UserId</th>\n      <th>Score</th>\n      <th>Summary</th>\n      <th>Text</th>\n      <th>combined</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>0</th>\n      <td>1351123200</td>\n      <td>B003XPF9BO</td>\n      <td>A3R7JR3FMEBXQB</td>\n      <td>5</td>\n      <td>where does one  start...and stop... with a tre...</td>\n      <td>Wanted to save some to bring to my Chicago fam...</td>\n      <td>Title: where does one  start...and stop... wit...</td>\n    </tr>\n    <tr>\n      <th>1</th>\n      <td>1351123200</td>\n      <td>B003JK537S</td>\n      <td>A3JBPC3WFUT5ZP</td>\n      <td>1</td>\n      <td>Arrived in pieces</td>\n      <td>Not pleased at all. When I opened the box, mos...</td>\n      <td>Title: Arrived in pieces; Content: Not pleased...</td>\n    </tr>\n  </tbody>\n</table>\n</div>"
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# load & inspect dataset  加载检查数据集\n",
    "input_datapath = \"data/fine_food_reviews_1k.csv\"  # to save space, we provide a pre-filtered dataset\n",
    "df = pd.read_csv(input_datapath, index_col=0)\n",
    "df = df[[\"Time\", \"ProductId\", \"UserId\", \"Score\", \"Summary\", \"Text\"]]\n",
    "df = df.dropna() #删除包含缺失值的所有行\n",
    "#合数据集：新增一个字段combined列，由Title和Content合并而成\n",
    "df[\"combined\"] = (\n",
    "    \"Title: \" + df.Summary.str.strip() + \"; Content: \" + df.Text.str.strip()\n",
    ")\n",
    "df.head(2)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": "1000"
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# subsample to 1k most recent reviews and remove samples that are too long\n",
    "# 取出1000条最新的评论记录，如果超过了token阀值则移除\n",
    "top_n = 1000\n",
    "#按Time字段进行排序\n",
    "df = df.sort_values(\"Time\").tail(top_n * 2)  # first cut to first 2k entries, assuming less than half will be filtered out\n",
    "#删除Time字段，后面不需要使用到Time\n",
    "df.drop(\"Time\", axis=1, inplace=True)\n",
    "\n",
    "encoding = tiktoken.get_encoding(embedding_encoding)\n",
    "\n",
    "# omit reviews that are too long to embed\n",
    "#将得到的编码embedding剔除超长的，放入n_tokens列\n",
    "df[\"n_tokens\"] = df.combined.apply(lambda x: len(encoding.encode(x)))\n",
    "df = df[df.n_tokens <= max_tokens].tail(top_n)\n",
    "len(df)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2. Get embeddings and save them for future reuse"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "ename": "RateLimitError",
     "evalue": "Error code: 429 - {'error': {'message': 'Rate limit reached for text-embedding-3-small in organization org-vQCmbcqeaYFVL7mCXJnFGpTo on requests per day (RPD): Limit 200, Used 200, Requested 1. Please try again in 7m12s. Visit https://platform.openai.com/account/rate-limits to learn more. You can increase your rate limit by adding a payment method to your account at https://platform.openai.com/account/billing.', 'type': 'requests', 'param': None, 'code': 'rate_limit_exceeded'}}",
     "output_type": "error",
     "traceback": [
      "\u001B[1;31m---------------------------------------------------------------------------\u001B[0m",
      "\u001B[1;31mRateLimitError\u001B[0m                            Traceback (most recent call last)",
      "Cell \u001B[1;32mIn[5], line 6\u001B[0m\n\u001B[0;32m      1\u001B[0m \u001B[38;5;66;03m# Ensure you have your API key set in your environment per the README: https://github.com/openai/openai-python#usage\u001B[39;00m\n\u001B[0;32m      2\u001B[0m \n\u001B[0;32m      3\u001B[0m \u001B[38;5;66;03m# This may take a few minutes\u001B[39;00m\n\u001B[0;32m      4\u001B[0m \u001B[38;5;66;03m#新增一列embedding，存放通过lambda函数，获取到每一个联合列combined的值对应的embedding\u001B[39;00m\n\u001B[0;32m      5\u001B[0m \u001B[38;5;66;03m#另存到文件fine_food_reviews_with_embeddings_1k.csv，作为知识库备用\u001B[39;00m\n\u001B[1;32m----> 6\u001B[0m df[\u001B[38;5;124m\"\u001B[39m\u001B[38;5;124membedding\u001B[39m\u001B[38;5;124m\"\u001B[39m] \u001B[38;5;241m=\u001B[39m \u001B[43mdf\u001B[49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43mcombined\u001B[49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43mapply\u001B[49m\u001B[43m(\u001B[49m\u001B[38;5;28;43;01mlambda\u001B[39;49;00m\u001B[43m \u001B[49m\u001B[43mx\u001B[49m\u001B[43m:\u001B[49m\u001B[43m \u001B[49m\u001B[43mget_embedding\u001B[49m\u001B[43m(\u001B[49m\u001B[43mx\u001B[49m\u001B[43m,\u001B[49m\u001B[43m \u001B[49m\u001B[43mmodel\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43membedding_model\u001B[49m\u001B[43m)\u001B[49m\u001B[43m)\u001B[49m\n\u001B[0;32m      7\u001B[0m df\u001B[38;5;241m.\u001B[39mto_csv(\u001B[38;5;124m\"\u001B[39m\u001B[38;5;124mdata/fine_food_reviews_with_embeddings_1k.csv\u001B[39m\u001B[38;5;124m\"\u001B[39m)\n",
      "File \u001B[1;32mD:\\Program Files (x86)\\python3\\lib\\site-packages\\pandas\\core\\series.py:4915\u001B[0m, in \u001B[0;36mSeries.apply\u001B[1;34m(self, func, convert_dtype, args, by_row, **kwargs)\u001B[0m\n\u001B[0;32m   4780\u001B[0m \u001B[38;5;28;01mdef\u001B[39;00m \u001B[38;5;21mapply\u001B[39m(\n\u001B[0;32m   4781\u001B[0m     \u001B[38;5;28mself\u001B[39m,\n\u001B[0;32m   4782\u001B[0m     func: AggFuncType,\n\u001B[1;32m   (...)\u001B[0m\n\u001B[0;32m   4787\u001B[0m     \u001B[38;5;241m*\u001B[39m\u001B[38;5;241m*\u001B[39mkwargs,\n\u001B[0;32m   4788\u001B[0m ) \u001B[38;5;241m-\u001B[39m\u001B[38;5;241m>\u001B[39m DataFrame \u001B[38;5;241m|\u001B[39m Series:\n\u001B[0;32m   4789\u001B[0m \u001B[38;5;250m    \u001B[39m\u001B[38;5;124;03m\"\"\"\u001B[39;00m\n\u001B[0;32m   4790\u001B[0m \u001B[38;5;124;03m    Invoke function on values of Series.\u001B[39;00m\n\u001B[0;32m   4791\u001B[0m \n\u001B[1;32m   (...)\u001B[0m\n\u001B[0;32m   4906\u001B[0m \u001B[38;5;124;03m    dtype: float64\u001B[39;00m\n\u001B[0;32m   4907\u001B[0m \u001B[38;5;124;03m    \"\"\"\u001B[39;00m\n\u001B[0;32m   4908\u001B[0m     \u001B[38;5;28;01mreturn\u001B[39;00m \u001B[43mSeriesApply\u001B[49m\u001B[43m(\u001B[49m\n\u001B[0;32m   4909\u001B[0m \u001B[43m        \u001B[49m\u001B[38;5;28;43mself\u001B[39;49m\u001B[43m,\u001B[49m\n\u001B[0;32m   4910\u001B[0m \u001B[43m        \u001B[49m\u001B[43mfunc\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m   4911\u001B[0m \u001B[43m        \u001B[49m\u001B[43mconvert_dtype\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mconvert_dtype\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m   4912\u001B[0m \u001B[43m        \u001B[49m\u001B[43mby_row\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mby_row\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m   4913\u001B[0m \u001B[43m        \u001B[49m\u001B[43margs\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43margs\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m   4914\u001B[0m \u001B[43m        \u001B[49m\u001B[43mkwargs\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mkwargs\u001B[49m\u001B[43m,\u001B[49m\n\u001B[1;32m-> 4915\u001B[0m \u001B[43m    \u001B[49m\u001B[43m)\u001B[49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43mapply\u001B[49m\u001B[43m(\u001B[49m\u001B[43m)\u001B[49m\n",
      "File \u001B[1;32mD:\\Program Files (x86)\\python3\\lib\\site-packages\\pandas\\core\\apply.py:1427\u001B[0m, in \u001B[0;36mSeriesApply.apply\u001B[1;34m(self)\u001B[0m\n\u001B[0;32m   1424\u001B[0m     \u001B[38;5;28;01mreturn\u001B[39;00m \u001B[38;5;28mself\u001B[39m\u001B[38;5;241m.\u001B[39mapply_compat()\n\u001B[0;32m   1426\u001B[0m \u001B[38;5;66;03m# self.func is Callable\u001B[39;00m\n\u001B[1;32m-> 1427\u001B[0m \u001B[38;5;28;01mreturn\u001B[39;00m \u001B[38;5;28;43mself\u001B[39;49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43mapply_standard\u001B[49m\u001B[43m(\u001B[49m\u001B[43m)\u001B[49m\n",
      "File \u001B[1;32mD:\\Program Files (x86)\\python3\\lib\\site-packages\\pandas\\core\\apply.py:1507\u001B[0m, in \u001B[0;36mSeriesApply.apply_standard\u001B[1;34m(self)\u001B[0m\n\u001B[0;32m   1501\u001B[0m \u001B[38;5;66;03m# row-wise access\u001B[39;00m\n\u001B[0;32m   1502\u001B[0m \u001B[38;5;66;03m# apply doesn't have a `na_action` keyword and for backward compat reasons\u001B[39;00m\n\u001B[0;32m   1503\u001B[0m \u001B[38;5;66;03m# we need to give `na_action=\"ignore\"` for categorical data.\u001B[39;00m\n\u001B[0;32m   1504\u001B[0m \u001B[38;5;66;03m# TODO: remove the `na_action=\"ignore\"` when that default has been changed in\u001B[39;00m\n\u001B[0;32m   1505\u001B[0m \u001B[38;5;66;03m#  Categorical (GH51645).\u001B[39;00m\n\u001B[0;32m   1506\u001B[0m action \u001B[38;5;241m=\u001B[39m \u001B[38;5;124m\"\u001B[39m\u001B[38;5;124mignore\u001B[39m\u001B[38;5;124m\"\u001B[39m \u001B[38;5;28;01mif\u001B[39;00m \u001B[38;5;28misinstance\u001B[39m(obj\u001B[38;5;241m.\u001B[39mdtype, CategoricalDtype) \u001B[38;5;28;01melse\u001B[39;00m \u001B[38;5;28;01mNone\u001B[39;00m\n\u001B[1;32m-> 1507\u001B[0m mapped \u001B[38;5;241m=\u001B[39m \u001B[43mobj\u001B[49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43m_map_values\u001B[49m\u001B[43m(\u001B[49m\n\u001B[0;32m   1508\u001B[0m \u001B[43m    \u001B[49m\u001B[43mmapper\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mcurried\u001B[49m\u001B[43m,\u001B[49m\u001B[43m \u001B[49m\u001B[43mna_action\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43maction\u001B[49m\u001B[43m,\u001B[49m\u001B[43m \u001B[49m\u001B[43mconvert\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[38;5;28;43mself\u001B[39;49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43mconvert_dtype\u001B[49m\n\u001B[0;32m   1509\u001B[0m \u001B[43m\u001B[49m\u001B[43m)\u001B[49m\n\u001B[0;32m   1511\u001B[0m \u001B[38;5;28;01mif\u001B[39;00m \u001B[38;5;28mlen\u001B[39m(mapped) \u001B[38;5;129;01mand\u001B[39;00m \u001B[38;5;28misinstance\u001B[39m(mapped[\u001B[38;5;241m0\u001B[39m], ABCSeries):\n\u001B[0;32m   1512\u001B[0m     \u001B[38;5;66;03m# GH#43986 Need to do list(mapped) in order to get treated as nested\u001B[39;00m\n\u001B[0;32m   1513\u001B[0m     \u001B[38;5;66;03m#  See also GH#25959 regarding EA support\u001B[39;00m\n\u001B[0;32m   1514\u001B[0m     \u001B[38;5;28;01mreturn\u001B[39;00m obj\u001B[38;5;241m.\u001B[39m_constructor_expanddim(\u001B[38;5;28mlist\u001B[39m(mapped), index\u001B[38;5;241m=\u001B[39mobj\u001B[38;5;241m.\u001B[39mindex)\n",
      "File \u001B[1;32mD:\\Program Files (x86)\\python3\\lib\\site-packages\\pandas\\core\\base.py:921\u001B[0m, in \u001B[0;36mIndexOpsMixin._map_values\u001B[1;34m(self, mapper, na_action, convert)\u001B[0m\n\u001B[0;32m    918\u001B[0m \u001B[38;5;28;01mif\u001B[39;00m \u001B[38;5;28misinstance\u001B[39m(arr, ExtensionArray):\n\u001B[0;32m    919\u001B[0m     \u001B[38;5;28;01mreturn\u001B[39;00m arr\u001B[38;5;241m.\u001B[39mmap(mapper, na_action\u001B[38;5;241m=\u001B[39mna_action)\n\u001B[1;32m--> 921\u001B[0m \u001B[38;5;28;01mreturn\u001B[39;00m \u001B[43malgorithms\u001B[49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43mmap_array\u001B[49m\u001B[43m(\u001B[49m\u001B[43marr\u001B[49m\u001B[43m,\u001B[49m\u001B[43m \u001B[49m\u001B[43mmapper\u001B[49m\u001B[43m,\u001B[49m\u001B[43m \u001B[49m\u001B[43mna_action\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mna_action\u001B[49m\u001B[43m,\u001B[49m\u001B[43m \u001B[49m\u001B[43mconvert\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mconvert\u001B[49m\u001B[43m)\u001B[49m\n",
      "File \u001B[1;32mD:\\Program Files (x86)\\python3\\lib\\site-packages\\pandas\\core\\algorithms.py:1743\u001B[0m, in \u001B[0;36mmap_array\u001B[1;34m(arr, mapper, na_action, convert)\u001B[0m\n\u001B[0;32m   1741\u001B[0m values \u001B[38;5;241m=\u001B[39m arr\u001B[38;5;241m.\u001B[39mastype(\u001B[38;5;28mobject\u001B[39m, copy\u001B[38;5;241m=\u001B[39m\u001B[38;5;28;01mFalse\u001B[39;00m)\n\u001B[0;32m   1742\u001B[0m \u001B[38;5;28;01mif\u001B[39;00m na_action \u001B[38;5;129;01mis\u001B[39;00m \u001B[38;5;28;01mNone\u001B[39;00m:\n\u001B[1;32m-> 1743\u001B[0m     \u001B[38;5;28;01mreturn\u001B[39;00m \u001B[43mlib\u001B[49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43mmap_infer\u001B[49m\u001B[43m(\u001B[49m\u001B[43mvalues\u001B[49m\u001B[43m,\u001B[49m\u001B[43m \u001B[49m\u001B[43mmapper\u001B[49m\u001B[43m,\u001B[49m\u001B[43m \u001B[49m\u001B[43mconvert\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mconvert\u001B[49m\u001B[43m)\u001B[49m\n\u001B[0;32m   1744\u001B[0m \u001B[38;5;28;01melse\u001B[39;00m:\n\u001B[0;32m   1745\u001B[0m     \u001B[38;5;28;01mreturn\u001B[39;00m lib\u001B[38;5;241m.\u001B[39mmap_infer_mask(\n\u001B[0;32m   1746\u001B[0m         values, mapper, mask\u001B[38;5;241m=\u001B[39misna(values)\u001B[38;5;241m.\u001B[39mview(np\u001B[38;5;241m.\u001B[39muint8), convert\u001B[38;5;241m=\u001B[39mconvert\n\u001B[0;32m   1747\u001B[0m     )\n",
      "File \u001B[1;32mlib.pyx:2972\u001B[0m, in \u001B[0;36mpandas._libs.lib.map_infer\u001B[1;34m()\u001B[0m\n",
      "Cell \u001B[1;32mIn[5], line 6\u001B[0m, in \u001B[0;36m<lambda>\u001B[1;34m(x)\u001B[0m\n\u001B[0;32m      1\u001B[0m \u001B[38;5;66;03m# Ensure you have your API key set in your environment per the README: https://github.com/openai/openai-python#usage\u001B[39;00m\n\u001B[0;32m      2\u001B[0m \n\u001B[0;32m      3\u001B[0m \u001B[38;5;66;03m# This may take a few minutes\u001B[39;00m\n\u001B[0;32m      4\u001B[0m \u001B[38;5;66;03m#新增一列embedding，存放通过lambda函数，获取到每一个联合列combined的值对应的embedding\u001B[39;00m\n\u001B[0;32m      5\u001B[0m \u001B[38;5;66;03m#另存到文件fine_food_reviews_with_embeddings_1k.csv，作为知识库备用\u001B[39;00m\n\u001B[1;32m----> 6\u001B[0m df[\u001B[38;5;124m\"\u001B[39m\u001B[38;5;124membedding\u001B[39m\u001B[38;5;124m\"\u001B[39m] \u001B[38;5;241m=\u001B[39m df\u001B[38;5;241m.\u001B[39mcombined\u001B[38;5;241m.\u001B[39mapply(\u001B[38;5;28;01mlambda\u001B[39;00m x: \u001B[43mget_embedding\u001B[49m\u001B[43m(\u001B[49m\u001B[43mx\u001B[49m\u001B[43m,\u001B[49m\u001B[43m \u001B[49m\u001B[43mmodel\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43membedding_model\u001B[49m\u001B[43m)\u001B[49m)\n\u001B[0;32m      7\u001B[0m df\u001B[38;5;241m.\u001B[39mto_csv(\u001B[38;5;124m\"\u001B[39m\u001B[38;5;124mdata/fine_food_reviews_with_embeddings_1k.csv\u001B[39m\u001B[38;5;124m\"\u001B[39m)\n",
      "File \u001B[1;32mD:\\aigc\\openai_learning\\练习实践\\openai_embeddings\\utils\\embedings_utils.py:22\u001B[0m, in \u001B[0;36mget_embedding\u001B[1;34m(text, model, **kwargs)\u001B[0m\n\u001B[0;32m     18\u001B[0m \u001B[38;5;28;01mdef\u001B[39;00m \u001B[38;5;21mget_embedding\u001B[39m(text: \u001B[38;5;28mstr\u001B[39m, model\u001B[38;5;241m=\u001B[39m\u001B[38;5;124m\"\u001B[39m\u001B[38;5;124mtext-embedding-3-small\u001B[39m\u001B[38;5;124m\"\u001B[39m, \u001B[38;5;241m*\u001B[39m\u001B[38;5;241m*\u001B[39mkwargs) \u001B[38;5;241m-\u001B[39m\u001B[38;5;241m>\u001B[39m List[\u001B[38;5;28mfloat\u001B[39m]:\n\u001B[0;32m     19\u001B[0m     \u001B[38;5;66;03m# replace newlines, which can negatively affect performance.\u001B[39;00m\n\u001B[0;32m     20\u001B[0m     text \u001B[38;5;241m=\u001B[39m text\u001B[38;5;241m.\u001B[39mreplace(\u001B[38;5;124m\"\u001B[39m\u001B[38;5;130;01m\\n\u001B[39;00m\u001B[38;5;124m\"\u001B[39m, \u001B[38;5;124m\"\u001B[39m\u001B[38;5;124m \u001B[39m\u001B[38;5;124m\"\u001B[39m)\n\u001B[1;32m---> 22\u001B[0m     response \u001B[38;5;241m=\u001B[39m client\u001B[38;5;241m.\u001B[39membeddings\u001B[38;5;241m.\u001B[39mcreate(\u001B[38;5;28minput\u001B[39m\u001B[38;5;241m=\u001B[39m[text], model\u001B[38;5;241m=\u001B[39mmodel, \u001B[38;5;241m*\u001B[39m\u001B[38;5;241m*\u001B[39mkwargs)\n\u001B[0;32m     24\u001B[0m     \u001B[38;5;28;01mreturn\u001B[39;00m response\u001B[38;5;241m.\u001B[39mdata[\u001B[38;5;241m0\u001B[39m]\u001B[38;5;241m.\u001B[39membedding\n",
      "File \u001B[1;32mD:\\Program Files (x86)\\python3\\lib\\site-packages\\openai\\resources\\embeddings.py:113\u001B[0m, in \u001B[0;36mEmbeddings.create\u001B[1;34m(self, input, model, dimensions, encoding_format, user, extra_headers, extra_query, extra_body, timeout)\u001B[0m\n\u001B[0;32m    107\u001B[0m         embedding\u001B[38;5;241m.\u001B[39membedding \u001B[38;5;241m=\u001B[39m np\u001B[38;5;241m.\u001B[39mfrombuffer(  \u001B[38;5;66;03m# type: ignore[no-untyped-call]\u001B[39;00m\n\u001B[0;32m    108\u001B[0m             base64\u001B[38;5;241m.\u001B[39mb64decode(data), dtype\u001B[38;5;241m=\u001B[39m\u001B[38;5;124m\"\u001B[39m\u001B[38;5;124mfloat32\u001B[39m\u001B[38;5;124m\"\u001B[39m\n\u001B[0;32m    109\u001B[0m         )\u001B[38;5;241m.\u001B[39mtolist()\n\u001B[0;32m    111\u001B[0m     \u001B[38;5;28;01mreturn\u001B[39;00m obj\n\u001B[1;32m--> 113\u001B[0m \u001B[38;5;28;01mreturn\u001B[39;00m \u001B[38;5;28;43mself\u001B[39;49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43m_post\u001B[49m\u001B[43m(\u001B[49m\n\u001B[0;32m    114\u001B[0m \u001B[43m    \u001B[49m\u001B[38;5;124;43m\"\u001B[39;49m\u001B[38;5;124;43m/embeddings\u001B[39;49m\u001B[38;5;124;43m\"\u001B[39;49m\u001B[43m,\u001B[49m\n\u001B[0;32m    115\u001B[0m \u001B[43m    \u001B[49m\u001B[43mbody\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mmaybe_transform\u001B[49m\u001B[43m(\u001B[49m\u001B[43mparams\u001B[49m\u001B[43m,\u001B[49m\u001B[43m \u001B[49m\u001B[43membedding_create_params\u001B[49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43mEmbeddingCreateParams\u001B[49m\u001B[43m)\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m    116\u001B[0m \u001B[43m    \u001B[49m\u001B[43moptions\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mmake_request_options\u001B[49m\u001B[43m(\u001B[49m\n\u001B[0;32m    117\u001B[0m \u001B[43m        \u001B[49m\u001B[43mextra_headers\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mextra_headers\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m    118\u001B[0m \u001B[43m        \u001B[49m\u001B[43mextra_query\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mextra_query\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m    119\u001B[0m \u001B[43m        \u001B[49m\u001B[43mextra_body\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mextra_body\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m    120\u001B[0m \u001B[43m        \u001B[49m\u001B[43mtimeout\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mtimeout\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m    121\u001B[0m \u001B[43m        \u001B[49m\u001B[43mpost_parser\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mparser\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m    122\u001B[0m \u001B[43m    \u001B[49m\u001B[43m)\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m    123\u001B[0m \u001B[43m    \u001B[49m\u001B[43mcast_to\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mCreateEmbeddingResponse\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m    124\u001B[0m \u001B[43m\u001B[49m\u001B[43m)\u001B[49m\n",
      "File \u001B[1;32mD:\\Program Files (x86)\\python3\\lib\\site-packages\\openai\\_base_client.py:1208\u001B[0m, in \u001B[0;36mSyncAPIClient.post\u001B[1;34m(self, path, cast_to, body, options, files, stream, stream_cls)\u001B[0m\n\u001B[0;32m   1194\u001B[0m \u001B[38;5;28;01mdef\u001B[39;00m \u001B[38;5;21mpost\u001B[39m(\n\u001B[0;32m   1195\u001B[0m     \u001B[38;5;28mself\u001B[39m,\n\u001B[0;32m   1196\u001B[0m     path: \u001B[38;5;28mstr\u001B[39m,\n\u001B[1;32m   (...)\u001B[0m\n\u001B[0;32m   1203\u001B[0m     stream_cls: \u001B[38;5;28mtype\u001B[39m[_StreamT] \u001B[38;5;241m|\u001B[39m \u001B[38;5;28;01mNone\u001B[39;00m \u001B[38;5;241m=\u001B[39m \u001B[38;5;28;01mNone\u001B[39;00m,\n\u001B[0;32m   1204\u001B[0m ) \u001B[38;5;241m-\u001B[39m\u001B[38;5;241m>\u001B[39m ResponseT \u001B[38;5;241m|\u001B[39m _StreamT:\n\u001B[0;32m   1205\u001B[0m     opts \u001B[38;5;241m=\u001B[39m FinalRequestOptions\u001B[38;5;241m.\u001B[39mconstruct(\n\u001B[0;32m   1206\u001B[0m         method\u001B[38;5;241m=\u001B[39m\u001B[38;5;124m\"\u001B[39m\u001B[38;5;124mpost\u001B[39m\u001B[38;5;124m\"\u001B[39m, url\u001B[38;5;241m=\u001B[39mpath, json_data\u001B[38;5;241m=\u001B[39mbody, files\u001B[38;5;241m=\u001B[39mto_httpx_files(files), \u001B[38;5;241m*\u001B[39m\u001B[38;5;241m*\u001B[39moptions\n\u001B[0;32m   1207\u001B[0m     )\n\u001B[1;32m-> 1208\u001B[0m     \u001B[38;5;28;01mreturn\u001B[39;00m cast(ResponseT, \u001B[38;5;28;43mself\u001B[39;49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43mrequest\u001B[49m\u001B[43m(\u001B[49m\u001B[43mcast_to\u001B[49m\u001B[43m,\u001B[49m\u001B[43m \u001B[49m\u001B[43mopts\u001B[49m\u001B[43m,\u001B[49m\u001B[43m \u001B[49m\u001B[43mstream\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mstream\u001B[49m\u001B[43m,\u001B[49m\u001B[43m \u001B[49m\u001B[43mstream_cls\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mstream_cls\u001B[49m\u001B[43m)\u001B[49m)\n",
      "File \u001B[1;32mD:\\Program Files (x86)\\python3\\lib\\site-packages\\openai\\_base_client.py:897\u001B[0m, in \u001B[0;36mSyncAPIClient.request\u001B[1;34m(self, cast_to, options, remaining_retries, stream, stream_cls)\u001B[0m\n\u001B[0;32m    888\u001B[0m \u001B[38;5;28;01mdef\u001B[39;00m \u001B[38;5;21mrequest\u001B[39m(\n\u001B[0;32m    889\u001B[0m     \u001B[38;5;28mself\u001B[39m,\n\u001B[0;32m    890\u001B[0m     cast_to: Type[ResponseT],\n\u001B[1;32m   (...)\u001B[0m\n\u001B[0;32m    895\u001B[0m     stream_cls: \u001B[38;5;28mtype\u001B[39m[_StreamT] \u001B[38;5;241m|\u001B[39m \u001B[38;5;28;01mNone\u001B[39;00m \u001B[38;5;241m=\u001B[39m \u001B[38;5;28;01mNone\u001B[39;00m,\n\u001B[0;32m    896\u001B[0m ) \u001B[38;5;241m-\u001B[39m\u001B[38;5;241m>\u001B[39m ResponseT \u001B[38;5;241m|\u001B[39m _StreamT:\n\u001B[1;32m--> 897\u001B[0m     \u001B[38;5;28;01mreturn\u001B[39;00m \u001B[38;5;28;43mself\u001B[39;49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43m_request\u001B[49m\u001B[43m(\u001B[49m\n\u001B[0;32m    898\u001B[0m \u001B[43m        \u001B[49m\u001B[43mcast_to\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mcast_to\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m    899\u001B[0m \u001B[43m        \u001B[49m\u001B[43moptions\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43moptions\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m    900\u001B[0m \u001B[43m        \u001B[49m\u001B[43mstream\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mstream\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m    901\u001B[0m \u001B[43m        \u001B[49m\u001B[43mstream_cls\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mstream_cls\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m    902\u001B[0m \u001B[43m        \u001B[49m\u001B[43mremaining_retries\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mremaining_retries\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m    903\u001B[0m \u001B[43m    \u001B[49m\u001B[43m)\u001B[49m\n",
      "File \u001B[1;32mD:\\Program Files (x86)\\python3\\lib\\site-packages\\openai\\_base_client.py:973\u001B[0m, in \u001B[0;36mSyncAPIClient._request\u001B[1;34m(self, cast_to, options, remaining_retries, stream, stream_cls)\u001B[0m\n\u001B[0;32m    971\u001B[0m \u001B[38;5;28;01mif\u001B[39;00m retries \u001B[38;5;241m>\u001B[39m \u001B[38;5;241m0\u001B[39m \u001B[38;5;129;01mand\u001B[39;00m \u001B[38;5;28mself\u001B[39m\u001B[38;5;241m.\u001B[39m_should_retry(err\u001B[38;5;241m.\u001B[39mresponse):\n\u001B[0;32m    972\u001B[0m     err\u001B[38;5;241m.\u001B[39mresponse\u001B[38;5;241m.\u001B[39mclose()\n\u001B[1;32m--> 973\u001B[0m     \u001B[38;5;28;01mreturn\u001B[39;00m \u001B[38;5;28;43mself\u001B[39;49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43m_retry_request\u001B[49m\u001B[43m(\u001B[49m\n\u001B[0;32m    974\u001B[0m \u001B[43m        \u001B[49m\u001B[43moptions\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m    975\u001B[0m \u001B[43m        \u001B[49m\u001B[43mcast_to\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m    976\u001B[0m \u001B[43m        \u001B[49m\u001B[43mretries\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m    977\u001B[0m \u001B[43m        \u001B[49m\u001B[43merr\u001B[49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43mresponse\u001B[49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43mheaders\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m    978\u001B[0m \u001B[43m        \u001B[49m\u001B[43mstream\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mstream\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m    979\u001B[0m \u001B[43m        \u001B[49m\u001B[43mstream_cls\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mstream_cls\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m    980\u001B[0m \u001B[43m    \u001B[49m\u001B[43m)\u001B[49m\n\u001B[0;32m    982\u001B[0m \u001B[38;5;66;03m# If the response is streamed then we need to explicitly read the response\u001B[39;00m\n\u001B[0;32m    983\u001B[0m \u001B[38;5;66;03m# to completion before attempting to access the response text.\u001B[39;00m\n\u001B[0;32m    984\u001B[0m \u001B[38;5;28;01mif\u001B[39;00m \u001B[38;5;129;01mnot\u001B[39;00m err\u001B[38;5;241m.\u001B[39mresponse\u001B[38;5;241m.\u001B[39mis_closed:\n",
      "File \u001B[1;32mD:\\Program Files (x86)\\python3\\lib\\site-packages\\openai\\_base_client.py:1021\u001B[0m, in \u001B[0;36mSyncAPIClient._retry_request\u001B[1;34m(self, options, cast_to, remaining_retries, response_headers, stream, stream_cls)\u001B[0m\n\u001B[0;32m   1017\u001B[0m \u001B[38;5;66;03m# In a synchronous context we are blocking the entire thread. Up to the library user to run the client in a\u001B[39;00m\n\u001B[0;32m   1018\u001B[0m \u001B[38;5;66;03m# different thread if necessary.\u001B[39;00m\n\u001B[0;32m   1019\u001B[0m time\u001B[38;5;241m.\u001B[39msleep(timeout)\n\u001B[1;32m-> 1021\u001B[0m \u001B[38;5;28;01mreturn\u001B[39;00m \u001B[38;5;28;43mself\u001B[39;49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43m_request\u001B[49m\u001B[43m(\u001B[49m\n\u001B[0;32m   1022\u001B[0m \u001B[43m    \u001B[49m\u001B[43moptions\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43moptions\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m   1023\u001B[0m \u001B[43m    \u001B[49m\u001B[43mcast_to\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mcast_to\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m   1024\u001B[0m \u001B[43m    \u001B[49m\u001B[43mremaining_retries\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mremaining\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m   1025\u001B[0m \u001B[43m    \u001B[49m\u001B[43mstream\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mstream\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m   1026\u001B[0m \u001B[43m    \u001B[49m\u001B[43mstream_cls\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mstream_cls\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m   1027\u001B[0m \u001B[43m\u001B[49m\u001B[43m)\u001B[49m\n",
      "File \u001B[1;32mD:\\Program Files (x86)\\python3\\lib\\site-packages\\openai\\_base_client.py:973\u001B[0m, in \u001B[0;36mSyncAPIClient._request\u001B[1;34m(self, cast_to, options, remaining_retries, stream, stream_cls)\u001B[0m\n\u001B[0;32m    971\u001B[0m \u001B[38;5;28;01mif\u001B[39;00m retries \u001B[38;5;241m>\u001B[39m \u001B[38;5;241m0\u001B[39m \u001B[38;5;129;01mand\u001B[39;00m \u001B[38;5;28mself\u001B[39m\u001B[38;5;241m.\u001B[39m_should_retry(err\u001B[38;5;241m.\u001B[39mresponse):\n\u001B[0;32m    972\u001B[0m     err\u001B[38;5;241m.\u001B[39mresponse\u001B[38;5;241m.\u001B[39mclose()\n\u001B[1;32m--> 973\u001B[0m     \u001B[38;5;28;01mreturn\u001B[39;00m \u001B[38;5;28;43mself\u001B[39;49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43m_retry_request\u001B[49m\u001B[43m(\u001B[49m\n\u001B[0;32m    974\u001B[0m \u001B[43m        \u001B[49m\u001B[43moptions\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m    975\u001B[0m \u001B[43m        \u001B[49m\u001B[43mcast_to\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m    976\u001B[0m \u001B[43m        \u001B[49m\u001B[43mretries\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m    977\u001B[0m \u001B[43m        \u001B[49m\u001B[43merr\u001B[49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43mresponse\u001B[49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43mheaders\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m    978\u001B[0m \u001B[43m        \u001B[49m\u001B[43mstream\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mstream\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m    979\u001B[0m \u001B[43m        \u001B[49m\u001B[43mstream_cls\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mstream_cls\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m    980\u001B[0m \u001B[43m    \u001B[49m\u001B[43m)\u001B[49m\n\u001B[0;32m    982\u001B[0m \u001B[38;5;66;03m# If the response is streamed then we need to explicitly read the response\u001B[39;00m\n\u001B[0;32m    983\u001B[0m \u001B[38;5;66;03m# to completion before attempting to access the response text.\u001B[39;00m\n\u001B[0;32m    984\u001B[0m \u001B[38;5;28;01mif\u001B[39;00m \u001B[38;5;129;01mnot\u001B[39;00m err\u001B[38;5;241m.\u001B[39mresponse\u001B[38;5;241m.\u001B[39mis_closed:\n",
      "File \u001B[1;32mD:\\Program Files (x86)\\python3\\lib\\site-packages\\openai\\_base_client.py:1021\u001B[0m, in \u001B[0;36mSyncAPIClient._retry_request\u001B[1;34m(self, options, cast_to, remaining_retries, response_headers, stream, stream_cls)\u001B[0m\n\u001B[0;32m   1017\u001B[0m \u001B[38;5;66;03m# In a synchronous context we are blocking the entire thread. Up to the library user to run the client in a\u001B[39;00m\n\u001B[0;32m   1018\u001B[0m \u001B[38;5;66;03m# different thread if necessary.\u001B[39;00m\n\u001B[0;32m   1019\u001B[0m time\u001B[38;5;241m.\u001B[39msleep(timeout)\n\u001B[1;32m-> 1021\u001B[0m \u001B[38;5;28;01mreturn\u001B[39;00m \u001B[38;5;28;43mself\u001B[39;49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43m_request\u001B[49m\u001B[43m(\u001B[49m\n\u001B[0;32m   1022\u001B[0m \u001B[43m    \u001B[49m\u001B[43moptions\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43moptions\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m   1023\u001B[0m \u001B[43m    \u001B[49m\u001B[43mcast_to\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mcast_to\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m   1024\u001B[0m \u001B[43m    \u001B[49m\u001B[43mremaining_retries\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mremaining\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m   1025\u001B[0m \u001B[43m    \u001B[49m\u001B[43mstream\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mstream\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m   1026\u001B[0m \u001B[43m    \u001B[49m\u001B[43mstream_cls\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mstream_cls\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m   1027\u001B[0m \u001B[43m\u001B[49m\u001B[43m)\u001B[49m\n",
      "File \u001B[1;32mD:\\Program Files (x86)\\python3\\lib\\site-packages\\openai\\_base_client.py:973\u001B[0m, in \u001B[0;36mSyncAPIClient._request\u001B[1;34m(self, cast_to, options, remaining_retries, stream, stream_cls)\u001B[0m\n\u001B[0;32m    971\u001B[0m \u001B[38;5;28;01mif\u001B[39;00m retries \u001B[38;5;241m>\u001B[39m \u001B[38;5;241m0\u001B[39m \u001B[38;5;129;01mand\u001B[39;00m \u001B[38;5;28mself\u001B[39m\u001B[38;5;241m.\u001B[39m_should_retry(err\u001B[38;5;241m.\u001B[39mresponse):\n\u001B[0;32m    972\u001B[0m     err\u001B[38;5;241m.\u001B[39mresponse\u001B[38;5;241m.\u001B[39mclose()\n\u001B[1;32m--> 973\u001B[0m     \u001B[38;5;28;01mreturn\u001B[39;00m \u001B[38;5;28;43mself\u001B[39;49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43m_retry_request\u001B[49m\u001B[43m(\u001B[49m\n\u001B[0;32m    974\u001B[0m \u001B[43m        \u001B[49m\u001B[43moptions\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m    975\u001B[0m \u001B[43m        \u001B[49m\u001B[43mcast_to\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m    976\u001B[0m \u001B[43m        \u001B[49m\u001B[43mretries\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m    977\u001B[0m \u001B[43m        \u001B[49m\u001B[43merr\u001B[49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43mresponse\u001B[49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43mheaders\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m    978\u001B[0m \u001B[43m        \u001B[49m\u001B[43mstream\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mstream\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m    979\u001B[0m \u001B[43m        \u001B[49m\u001B[43mstream_cls\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mstream_cls\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m    980\u001B[0m \u001B[43m    \u001B[49m\u001B[43m)\u001B[49m\n\u001B[0;32m    982\u001B[0m \u001B[38;5;66;03m# If the response is streamed then we need to explicitly read the response\u001B[39;00m\n\u001B[0;32m    983\u001B[0m \u001B[38;5;66;03m# to completion before attempting to access the response text.\u001B[39;00m\n\u001B[0;32m    984\u001B[0m \u001B[38;5;28;01mif\u001B[39;00m \u001B[38;5;129;01mnot\u001B[39;00m err\u001B[38;5;241m.\u001B[39mresponse\u001B[38;5;241m.\u001B[39mis_closed:\n",
      "File \u001B[1;32mD:\\Program Files (x86)\\python3\\lib\\site-packages\\openai\\_base_client.py:1021\u001B[0m, in \u001B[0;36mSyncAPIClient._retry_request\u001B[1;34m(self, options, cast_to, remaining_retries, response_headers, stream, stream_cls)\u001B[0m\n\u001B[0;32m   1017\u001B[0m \u001B[38;5;66;03m# In a synchronous context we are blocking the entire thread. Up to the library user to run the client in a\u001B[39;00m\n\u001B[0;32m   1018\u001B[0m \u001B[38;5;66;03m# different thread if necessary.\u001B[39;00m\n\u001B[0;32m   1019\u001B[0m time\u001B[38;5;241m.\u001B[39msleep(timeout)\n\u001B[1;32m-> 1021\u001B[0m \u001B[38;5;28;01mreturn\u001B[39;00m \u001B[38;5;28;43mself\u001B[39;49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43m_request\u001B[49m\u001B[43m(\u001B[49m\n\u001B[0;32m   1022\u001B[0m \u001B[43m    \u001B[49m\u001B[43moptions\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43moptions\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m   1023\u001B[0m \u001B[43m    \u001B[49m\u001B[43mcast_to\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mcast_to\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m   1024\u001B[0m \u001B[43m    \u001B[49m\u001B[43mremaining_retries\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mremaining\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m   1025\u001B[0m \u001B[43m    \u001B[49m\u001B[43mstream\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mstream\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m   1026\u001B[0m \u001B[43m    \u001B[49m\u001B[43mstream_cls\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mstream_cls\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m   1027\u001B[0m \u001B[43m\u001B[49m\u001B[43m)\u001B[49m\n",
      "File \u001B[1;32mD:\\Program Files (x86)\\python3\\lib\\site-packages\\openai\\_base_client.py:935\u001B[0m, in \u001B[0;36mSyncAPIClient._request\u001B[1;34m(self, cast_to, options, remaining_retries, stream, stream_cls)\u001B[0m\n\u001B[0;32m    932\u001B[0m log\u001B[38;5;241m.\u001B[39mdebug(\u001B[38;5;124m\"\u001B[39m\u001B[38;5;124mEncountered httpx.TimeoutException\u001B[39m\u001B[38;5;124m\"\u001B[39m, exc_info\u001B[38;5;241m=\u001B[39m\u001B[38;5;28;01mTrue\u001B[39;00m)\n\u001B[0;32m    934\u001B[0m \u001B[38;5;28;01mif\u001B[39;00m retries \u001B[38;5;241m>\u001B[39m \u001B[38;5;241m0\u001B[39m:\n\u001B[1;32m--> 935\u001B[0m     \u001B[38;5;28;01mreturn\u001B[39;00m \u001B[38;5;28;43mself\u001B[39;49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43m_retry_request\u001B[49m\u001B[43m(\u001B[49m\n\u001B[0;32m    936\u001B[0m \u001B[43m        \u001B[49m\u001B[43moptions\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m    937\u001B[0m \u001B[43m        \u001B[49m\u001B[43mcast_to\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m    938\u001B[0m \u001B[43m        \u001B[49m\u001B[43mretries\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m    939\u001B[0m \u001B[43m        \u001B[49m\u001B[43mstream\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mstream\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m    940\u001B[0m \u001B[43m        \u001B[49m\u001B[43mstream_cls\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mstream_cls\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m    941\u001B[0m \u001B[43m        \u001B[49m\u001B[43mresponse_headers\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[38;5;28;43;01mNone\u001B[39;49;00m\u001B[43m,\u001B[49m\n\u001B[0;32m    942\u001B[0m \u001B[43m    \u001B[49m\u001B[43m)\u001B[49m\n\u001B[0;32m    944\u001B[0m log\u001B[38;5;241m.\u001B[39mdebug(\u001B[38;5;124m\"\u001B[39m\u001B[38;5;124mRaising timeout error\u001B[39m\u001B[38;5;124m\"\u001B[39m)\n\u001B[0;32m    945\u001B[0m \u001B[38;5;28;01mraise\u001B[39;00m APITimeoutError(request\u001B[38;5;241m=\u001B[39mrequest) \u001B[38;5;28;01mfrom\u001B[39;00m \u001B[38;5;21;01merr\u001B[39;00m\n",
      "File \u001B[1;32mD:\\Program Files (x86)\\python3\\lib\\site-packages\\openai\\_base_client.py:1021\u001B[0m, in \u001B[0;36mSyncAPIClient._retry_request\u001B[1;34m(self, options, cast_to, remaining_retries, response_headers, stream, stream_cls)\u001B[0m\n\u001B[0;32m   1017\u001B[0m \u001B[38;5;66;03m# In a synchronous context we are blocking the entire thread. Up to the library user to run the client in a\u001B[39;00m\n\u001B[0;32m   1018\u001B[0m \u001B[38;5;66;03m# different thread if necessary.\u001B[39;00m\n\u001B[0;32m   1019\u001B[0m time\u001B[38;5;241m.\u001B[39msleep(timeout)\n\u001B[1;32m-> 1021\u001B[0m \u001B[38;5;28;01mreturn\u001B[39;00m \u001B[38;5;28;43mself\u001B[39;49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43m_request\u001B[49m\u001B[43m(\u001B[49m\n\u001B[0;32m   1022\u001B[0m \u001B[43m    \u001B[49m\u001B[43moptions\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43moptions\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m   1023\u001B[0m \u001B[43m    \u001B[49m\u001B[43mcast_to\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mcast_to\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m   1024\u001B[0m \u001B[43m    \u001B[49m\u001B[43mremaining_retries\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mremaining\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m   1025\u001B[0m \u001B[43m    \u001B[49m\u001B[43mstream\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mstream\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m   1026\u001B[0m \u001B[43m    \u001B[49m\u001B[43mstream_cls\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mstream_cls\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m   1027\u001B[0m \u001B[43m\u001B[49m\u001B[43m)\u001B[49m\n",
      "File \u001B[1;32mD:\\Program Files (x86)\\python3\\lib\\site-packages\\openai\\_base_client.py:973\u001B[0m, in \u001B[0;36mSyncAPIClient._request\u001B[1;34m(self, cast_to, options, remaining_retries, stream, stream_cls)\u001B[0m\n\u001B[0;32m    971\u001B[0m \u001B[38;5;28;01mif\u001B[39;00m retries \u001B[38;5;241m>\u001B[39m \u001B[38;5;241m0\u001B[39m \u001B[38;5;129;01mand\u001B[39;00m \u001B[38;5;28mself\u001B[39m\u001B[38;5;241m.\u001B[39m_should_retry(err\u001B[38;5;241m.\u001B[39mresponse):\n\u001B[0;32m    972\u001B[0m     err\u001B[38;5;241m.\u001B[39mresponse\u001B[38;5;241m.\u001B[39mclose()\n\u001B[1;32m--> 973\u001B[0m     \u001B[38;5;28;01mreturn\u001B[39;00m \u001B[38;5;28;43mself\u001B[39;49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43m_retry_request\u001B[49m\u001B[43m(\u001B[49m\n\u001B[0;32m    974\u001B[0m \u001B[43m        \u001B[49m\u001B[43moptions\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m    975\u001B[0m \u001B[43m        \u001B[49m\u001B[43mcast_to\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m    976\u001B[0m \u001B[43m        \u001B[49m\u001B[43mretries\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m    977\u001B[0m \u001B[43m        \u001B[49m\u001B[43merr\u001B[49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43mresponse\u001B[49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43mheaders\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m    978\u001B[0m \u001B[43m        \u001B[49m\u001B[43mstream\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mstream\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m    979\u001B[0m \u001B[43m        \u001B[49m\u001B[43mstream_cls\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mstream_cls\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m    980\u001B[0m \u001B[43m    \u001B[49m\u001B[43m)\u001B[49m\n\u001B[0;32m    982\u001B[0m \u001B[38;5;66;03m# If the response is streamed then we need to explicitly read the response\u001B[39;00m\n\u001B[0;32m    983\u001B[0m \u001B[38;5;66;03m# to completion before attempting to access the response text.\u001B[39;00m\n\u001B[0;32m    984\u001B[0m \u001B[38;5;28;01mif\u001B[39;00m \u001B[38;5;129;01mnot\u001B[39;00m err\u001B[38;5;241m.\u001B[39mresponse\u001B[38;5;241m.\u001B[39mis_closed:\n",
      "File \u001B[1;32mD:\\Program Files (x86)\\python3\\lib\\site-packages\\openai\\_base_client.py:1021\u001B[0m, in \u001B[0;36mSyncAPIClient._retry_request\u001B[1;34m(self, options, cast_to, remaining_retries, response_headers, stream, stream_cls)\u001B[0m\n\u001B[0;32m   1017\u001B[0m \u001B[38;5;66;03m# In a synchronous context we are blocking the entire thread. Up to the library user to run the client in a\u001B[39;00m\n\u001B[0;32m   1018\u001B[0m \u001B[38;5;66;03m# different thread if necessary.\u001B[39;00m\n\u001B[0;32m   1019\u001B[0m time\u001B[38;5;241m.\u001B[39msleep(timeout)\n\u001B[1;32m-> 1021\u001B[0m \u001B[38;5;28;01mreturn\u001B[39;00m \u001B[38;5;28;43mself\u001B[39;49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43m_request\u001B[49m\u001B[43m(\u001B[49m\n\u001B[0;32m   1022\u001B[0m \u001B[43m    \u001B[49m\u001B[43moptions\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43moptions\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m   1023\u001B[0m \u001B[43m    \u001B[49m\u001B[43mcast_to\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mcast_to\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m   1024\u001B[0m \u001B[43m    \u001B[49m\u001B[43mremaining_retries\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mremaining\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m   1025\u001B[0m \u001B[43m    \u001B[49m\u001B[43mstream\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mstream\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m   1026\u001B[0m \u001B[43m    \u001B[49m\u001B[43mstream_cls\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mstream_cls\u001B[49m\u001B[43m,\u001B[49m\n\u001B[0;32m   1027\u001B[0m \u001B[43m\u001B[49m\u001B[43m)\u001B[49m\n",
      "File \u001B[1;32mD:\\Program Files (x86)\\python3\\lib\\site-packages\\openai\\_base_client.py:988\u001B[0m, in \u001B[0;36mSyncAPIClient._request\u001B[1;34m(self, cast_to, options, remaining_retries, stream, stream_cls)\u001B[0m\n\u001B[0;32m    985\u001B[0m         err\u001B[38;5;241m.\u001B[39mresponse\u001B[38;5;241m.\u001B[39mread()\n\u001B[0;32m    987\u001B[0m     log\u001B[38;5;241m.\u001B[39mdebug(\u001B[38;5;124m\"\u001B[39m\u001B[38;5;124mRe-raising status error\u001B[39m\u001B[38;5;124m\"\u001B[39m)\n\u001B[1;32m--> 988\u001B[0m     \u001B[38;5;28;01mraise\u001B[39;00m \u001B[38;5;28mself\u001B[39m\u001B[38;5;241m.\u001B[39m_make_status_error_from_response(err\u001B[38;5;241m.\u001B[39mresponse) \u001B[38;5;28;01mfrom\u001B[39;00m \u001B[38;5;28;01mNone\u001B[39;00m\n\u001B[0;32m    990\u001B[0m \u001B[38;5;28;01mreturn\u001B[39;00m \u001B[38;5;28mself\u001B[39m\u001B[38;5;241m.\u001B[39m_process_response(\n\u001B[0;32m    991\u001B[0m     cast_to\u001B[38;5;241m=\u001B[39mcast_to,\n\u001B[0;32m    992\u001B[0m     options\u001B[38;5;241m=\u001B[39moptions,\n\u001B[1;32m   (...)\u001B[0m\n\u001B[0;32m    995\u001B[0m     stream_cls\u001B[38;5;241m=\u001B[39mstream_cls,\n\u001B[0;32m    996\u001B[0m )\n",
      "\u001B[1;31mRateLimitError\u001B[0m: Error code: 429 - {'error': {'message': 'Rate limit reached for text-embedding-3-small in organization org-vQCmbcqeaYFVL7mCXJnFGpTo on requests per day (RPD): Limit 200, Used 200, Requested 1. Please try again in 7m12s. Visit https://platform.openai.com/account/rate-limits to learn more. You can increase your rate limit by adding a payment method to your account at https://platform.openai.com/account/billing.', 'type': 'requests', 'param': None, 'code': 'rate_limit_exceeded'}}"
     ]
    }
   ],
   "source": [
    "#新增一列embedding，存放通过lambda函数，获取到每一个联合列combined的值对应的embedding\n",
    "#另存到文件fine_food_reviews_with_embeddings_1k.csv，作为知识库备用\n",
    "df[\"embedding\"] = df.combined.apply(lambda x: get_embedding(x, model=embedding_model))\n",
    "df.to_csv(\"data/fine_food_reviews_with_embeddings_1k.csv\")"
   ]
  },
  {
   "cell_type": "markdown",
   "source": [],
   "metadata": {
    "collapsed": false
   }
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [],
   "source": [
    "a = get_embedding(\"hi\", model=embedding_model)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "openai",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.5"
  },
  "orig_nbformat": 4,
  "vscode": {
   "interpreter": {
    "hash": "365536dcbde60510dc9073d6b991cd35db2d9bac356a11f5b64279a5e6708b97"
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
